A Reference Architecture and Roadmap for Enabling E-commerce on Apache Spark
Authors
Abstract
Apache Spark is an execution engine that, besides working as a standalone distributed in-memory computing engine, also offers close integration with the Hadoop Distributed File System (HDFS). Its underlying appeal is in providing a unified framework for building sophisticated applications that involve multiple workloads: it unifies those workloads, handles unstructured data well, and has easy-to-use APIs. Apache Spark also offers a streaming component, Spark Streaming, which writes streamed data into the same in-memory data structures, where it can be read by the Spark SQL component running on top of the core Spark framework. Through its MLlib and SparkR subprojects, Apache Spark can provide online machine learning, so that besides streaming data it can also execute machine learning libraries, functions, and algorithms. This paper analyzes Apache Spark and highlights the role of Apache Spark (and its ecosystem) in the architecture of a modern E-commerce platform. It also proposes horizontally and vertically scalable reference architectures for both small and medium-sized (SME) and large E-commerce enterprises.
General Terms: Apache Hadoop, Big Data, Distributed Computing, Analytics, Parallel Data Processing.
Similar resources
RightInsight: open source architecture for data science
We give the details of our reference architecture called RightInsight for enabling rapid data science. RightInsight is based purely on open source technologies. The data is stored in a standard distributed file system such as HDFS. The stored data is processed in Apache Spark, which provides an enhanced Map/Reduce programming environment. Its rich and powerful machine learning base makes it eas...
Evaluation of the effective factors in accepting e-commerce to develop a handmade carpet economy
Nowadays, the status of e-commerce in the exchange of art works is a subject of study for experts in the field of art economics. Considering the importance of this issue, identifying the effective factors in accepting e-commerce in this sector of the economy is essential. Hence, using this technology in the art sector, especially the handmade carpet exchanges, we can overcome the problems in the...
Real-time Text Analytics Pipeline Using Open-source Big Data Tools
Real-time text processing systems are required in many domains to quickly identify patterns, trends, sentiments, and insights. Nowadays, social networks, e-commerce stores, blogs, scientific experiments, and server logs are the main sources generating huge volumes of text data. However, processing huge volumes of text data in real time requires building a data processing pipeline. The main challenge in building such pip...
Risk Analysis in E-commerce via Fuzzy Logic
This paper describes the development of a fuzzy decision support system (FDSS) for the assessment of risk in E-commerce (EC) development. A Web-based prototype FDSS is suggested to assist EC project managers in identifying potential EC risk factors and the corresponding project risks. A risk analysis model for EC development using a fuzzy set approach is proposed and incorporated into the FDSS....
An E-Commerce Framework
Today more and more companies are using e-commerce solutions to offer new alternatives for distributing their goods. However, this kind of distribution is becoming increasingly competitive. Companies are urged to tailor their solutions and to add customer-oriented services in order to meet the end user requirements. The purpose of an e-commerce framework is to support the implementation of tailo...